Audio is one of the most common means of human communication, but at the same time it is easily abused to deceive people. With the AI revolution, the underlying technology is accessible to almost everyone, which makes forgery and fraud simple for criminals. In this work, we introduce a deep learning approach to develop a classifier that blindly classifies an input audio clip as real or imitated. The proposed model is trained on a set of important features extracted from a large audio dataset, and the resulting classifier is tested on the same features extracted from different audio clips. Two datasets were created for this work: an all-English dataset and a mixed Arabic-and-English dataset. Both datasets have been made available to the research community on GitHub at https://github.com/sass7/dataset. For comparison, the audio clips were also classified by human inspection, with native speakers as the subjects. The results are interesting and show strong accuracy.
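The abstract does not specify the exact feature set or model architecture, so the following is only a minimal sketch of such a pipeline, assuming MFCC-style features extracted with librosa and a small dense Keras classifier; the file paths and label layout in the usage comment are hypothetical.

```python
# Minimal sketch of a real-vs-imitated audio classifier.
# Assumptions: MFCC features via librosa and a small Keras dense network;
# the paper's actual features and architecture are not specified.
import numpy as np
import librosa
import tensorflow as tf

def extract_features(path, sr=16000, n_mfcc=40):
    """Load an audio file and return a fixed-size MFCC-based feature vector."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Mean and standard deviation over time give a compact clip-level descriptor.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def build_classifier(input_dim):
    """Small binary classifier: real (0) vs imitated (1)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Hypothetical usage: `files` and `labels` would come from the released datasets.
# X = np.stack([extract_features(f) for f in files])
# model = build_classifier(X.shape[1])
# model.fit(X, np.array(labels), epochs=30, validation_split=0.2)
```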
In this paper, we report the first stable results for gender prediction from eye movements. We use a dataset with face images as stimuli and a large sample of 370 participants. Stability has two meanings here: first, we are able to estimate the standard deviation (SD) of a single prediction experiment (about 4.1%), which is achieved by varying the number of participants; second, we are able to provide a mean accuracy with a very low standard error of the mean (SEM): our accuracy is 65.2% with a SEM of 0.80%, which is achieved by running many randomly selected training and test splits. Our study shows that two specific classifiers achieve the best accuracy: Random Forest and Logistic Regression. Our results reconfirm previous findings that women show a stronger bias toward the left eye of the stimulus.
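A hedged sketch of the evaluation protocol described above: mean accuracy with SEM over many random train/test splits, comparing Random Forest and Logistic Regression. The feature matrix and labels below are synthetic placeholders, since the actual eye-movement features are not given in the abstract.

```python
# Sketch of the repeated random-split protocol: mean accuracy and SEM
# over many train/test partitions, for Random Forest and Logistic Regression.
# The eye-movement feature matrix X and gender labels y are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(370, 50))        # placeholder: 370 participants, 50 features
y = rng.integers(0, 2, size=370)      # placeholder: binary gender label

def repeated_split_accuracy(model, X, y, n_repeats=100):
    """Mean accuracy and standard error of the mean over random splits."""
    scores = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.25, random_state=seed, stratify=y)
        scores.append(model.fit(X_tr, y_tr).score(X_te, y_te))
    scores = np.asarray(scores)
    return scores.mean(), scores.std(ddof=1) / np.sqrt(n_repeats)

for name, clf in [("Random Forest", RandomForestClassifier(n_estimators=200)),
                  ("Logistic Regression", LogisticRegression(max_iter=1000))]:
    mean_acc, sem = repeated_split_accuracy(clf, X, y)
    print(f"{name}: accuracy {mean_acc:.3f} +/- SEM {sem:.3f}")
```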
Knowledge graph embedding models have become an important area of machine learning. These models provide latent representations of the entities and relations in a knowledge graph, which can then be used in downstream machine learning tasks such as link prediction. The learning process of these models can be performed by contrasting positive and negative triples. While all triples contained in a knowledge graph (KG) are considered positive, negative triples are usually not readily available. The choice of the sampling method used to obtain them therefore plays a crucial role in the performance and effectiveness of knowledge graph embedding models. Most current methods draw negative samples from a random distribution over the entities of the underlying knowledge graph, which often also produces meaningless triples. Other known methods use adversarial techniques or generative neural networks, which reduces the efficiency of the process. In this paper, we propose a method that generates informative negative samples by taking the available complementary knowledge about entities into account. In particular, a pre-trained language model is used to obtain representations of symbolic entities from their textual information, and neighborhood clusters are formed by exploiting the distances between entities. Our comprehensive evaluation demonstrates the effectiveness of the proposed method on benchmark knowledge graphs with textual information for the link prediction task.
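The abstract describes forming neighborhood clusters from language-model embeddings of entity text and drawing negatives from them, but not the exact clustering or sampling rule. The sketch below therefore assumes a sentence-transformer encoder, k-means clusters, and corruption of the tail with an entity from the same cluster; the paper's actual procedure may differ.

```python
# Sketch of cluster-based informative negative sampling for KG embeddings.
# Assumptions: entity descriptions encoded with a sentence-transformer,
# k-means neighborhood clusters, negatives drawn from the corrupted
# entity's own cluster.
import random
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def build_clusters(entity_texts, n_clusters=50):
    """Map each entity id to a neighborhood cluster of textually similar entities."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = encoder.encode(list(entity_texts.values()))
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    clusters = {}
    for ent, lab in zip(entity_texts.keys(), labels):
        clusters.setdefault(lab, []).append(ent)
    ent_to_cluster = dict(zip(entity_texts.keys(), labels))
    return clusters, ent_to_cluster

def sample_negative(triple, clusters, ent_to_cluster):
    """Corrupt the tail with a textually similar (hence informative) entity."""
    head, relation, tail = triple
    candidates = [e for e in clusters[ent_to_cluster[tail]] if e != tail]
    if not candidates:                       # fall back to any other entity
        candidates = [e for e in ent_to_cluster if e != tail]
    return (head, relation, random.choice(candidates))

# Hypothetical usage:
# clusters, ent_to_cluster = build_clusters({"Q64": "Berlin, capital of Germany", ...})
# neg = sample_negative(("Q64", "capitalOf", "Q183"), clusters, ent_to_cluster)
```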
Several studies have reported that biometric identification based on eye movement characteristics can be used for authentication. This paper presents an extensive study of user identification across multiple datasets, based on an improved version of the method originally proposed by George and Routray. We analyze the method with respect to several factors that affect identification accuracy, such as the type of stimulus, the IVT parameters (used to segment trajectories into fixations and saccades), the addition of new features such as higher-order derivatives of eye movements, the inclusion of blink information, template aging, age, and gender. We find that three measures, namely selecting the best IVT parameters, adding higher-order derivative features, and including an additional blink classifier, have a positive effect on identification accuracy. The improvements range from a few percentage points up to an impressive 9% increase on one dataset.
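The abstract refers to IVT parameters used to segment gaze trajectories into fixations and saccades. Below is a minimal, generic velocity-threshold (I-VT) implementation; the sampling rate and threshold are placeholders, not the tuned parameters reported in the study.

```python
# Minimal I-VT (velocity threshold) segmentation of a gaze trajectory into
# fixation and saccade samples. Sampling rate and threshold are placeholders.
import numpy as np

def ivt_segment(x, y, sampling_rate_hz=1000.0, velocity_threshold=30.0):
    """Label each gaze sample as 'fixation' or 'saccade'.

    x, y are gaze positions in degrees of visual angle; the threshold is in deg/s.
    """
    dt = 1.0 / sampling_rate_hz
    vx = np.gradient(np.asarray(x, dtype=float), dt)
    vy = np.gradient(np.asarray(y, dtype=float), dt)
    speed = np.hypot(vx, vy)                 # point-to-point velocity in deg/s
    labels = np.where(speed < velocity_threshold, "fixation", "saccade")
    return labels, speed

# Higher-order derivative features, as explored in the study, can be built the
# same way, e.g. acceleration = np.gradient(speed, dt).
```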
Any organization needs to improve its products, services, and processes. In this context, engaging with customers and understanding their journey is essential. Organizations have leveraged various techniques and technologies to support customer engagement, from call centres to chatbots and virtual agents. Recently, these systems have used Machine Learning (ML) and Natural Language Processing (NLP) to analyze large volumes of customer feedback and engagement data. The goal is to understand customers in context and provide meaningful answers across various channels. Despite multiple advances in Conversational Artificial Intelligence (AI) and Recommender Systems (RS), it is still challenging to understand the intent behind customer questions during the customer journey. To address this challenge, in this paper, we study and analyze the recent work in Conversational Recommender Systems (CRS) in general and, more specifically, in chatbot-based CRS. We introduce a pipeline to contextualize the input utterances in conversations. We then take the next step towards leveraging reverse feature engineering to link the contextualized input and the learning model to support intent recognition. Since performance evaluation depends on the underlying ML models, we use transformer-based models to evaluate the proposed approach on a labelled dialogue dataset (MSDialog) of question-answering interactions between information seekers and answer providers.
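A hedged sketch of the transformer-based intent-recognition step, assuming a Hugging Face sequence-classification model fine-tuned on utterances labelled with MSDialog-style intent tags; the label subset and the way conversational context is prepended are assumptions, not the paper's exact pipeline.

```python
# Sketch of intent recognition over contextualized utterances with a
# transformer-based classifier. The label set and context handling are assumed.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

INTENT_LABELS = ["original_question", "follow_up", "clarifying_question",
                 "potential_answer", "feedback"]        # placeholder subset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(INTENT_LABELS))

def contextualize(history, utterance, max_turns=3):
    """Prepend the last few turns so the model sees the utterance in context."""
    context = " [SEP] ".join(history[-max_turns:])
    return f"{context} [SEP] {utterance}" if context else utterance

def predict_intent(history, utterance):
    text = contextualize(history, utterance)
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        logits = model(**inputs).logits
    return INTENT_LABELS[int(logits.argmax(dim=-1))]

# Hypothetical usage (the model would first be fine-tuned on labelled dialogues):
# print(predict_intent(["How do I reset my password?"], "Does that work on mobile?"))
```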
A large number of empirical studies on applying self-attention models in the domain of recommender systems are based on offline evaluation and metrics computed on standardized datasets, without insights into how these models perform in real-life scenarios. Moreover, many of them do not consider information such as item and customer metadata, although deep-learning recommenders live up to their full potential only when numerous features of heterogeneous types are included. Also, recommendation models are typically designed to serve only a single use case well, which increases modeling complexity and maintenance costs and may lead to an inconsistent customer experience. In this work, we present a reusable Attention-based Fashion Recommendation Algorithm (AFRA) that utilizes various interaction types with different fashion entities, such as items (e.g., a shirt), outfits, and influencers, together with their heterogeneous features. Moreover, we leverage temporal and contextual information to address both short- and long-term customer preferences. We show its effectiveness on outfit recommendation use cases, in particular: 1) a personalized ranked feed; 2) outfit recommendations by style; 3) similar-item recommendations; and 4) in-session recommendations inspired by the most recent customer actions. We present both offline and online experimental results demonstrating substantial improvements in customer retention and engagement.
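The AFRA architecture itself is not detailed in the abstract; the following is only a minimal sketch of the general idea, assuming learned embeddings for heterogeneous entities and interaction types plus a Transformer encoder over the interaction sequence. Class names, vocabulary sizes, and dimensions are illustrative.

```python
# Minimal sketch of an attention-based recommender over a customer's sequence
# of interactions with heterogeneous fashion entities (items, outfits, influencers).
# Entity/interaction vocabularies and dimensions are illustrative, not AFRA's.
import torch
import torch.nn as nn

class AttentionRecommender(nn.Module):
    def __init__(self, n_entities, n_interaction_types, d_model=64, n_heads=4):
        super().__init__()
        self.entity_emb = nn.Embedding(n_entities, d_model)
        self.type_emb = nn.Embedding(n_interaction_types, d_model)  # e.g. view, like, buy
        self.pos_emb = nn.Embedding(512, d_model)                   # temporal position
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, entity_ids, type_ids):
        # entity_ids, type_ids: (batch, seq_len) integer tensors
        positions = torch.arange(entity_ids.size(1), device=entity_ids.device)
        h = self.entity_emb(entity_ids) + self.type_emb(type_ids) + self.pos_emb(positions)
        h = self.encoder(h)                         # self-attention over the history
        user_repr = h[:, -1]                        # latest state summarizes the session
        return user_repr @ self.entity_emb.weight.T # score all candidate entities

# Hypothetical usage:
# model = AttentionRecommender(n_entities=10000, n_interaction_types=5)
# scores = model(torch.randint(0, 10000, (2, 20)), torch.randint(0, 5, (2, 20)))
```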
Over the past years, fashion-related challenges have gained a lot of attention in the research community. Outfit generation and recommendation, i.e., the composition of a set of items of different types (e.g., tops, bottoms, shoes, accessories) that go well together, are among the most challenging ones. That is because the items have to be compatible with each other and also personalized to match the taste of the customer. Recently there has been a plethora of work targeted at tackling these problems by adopting various techniques and algorithms from the machine learning literature. However, to date, there is no extensive comparison of the performance of the different algorithms for outfit generation and recommendation. In this paper, we close this gap by providing a broad evaluation and comparison of various algorithms, including both personalized and non-personalized approaches, using online, real-world user data from one of Europe's largest fashion stores. We present the adaptations we made to some of those models to make them suitable for personalized outfit generation. Moreover, we provide insights for models that have not yet been evaluated on this task, specifically GPT, BERT, and Seq-to-Seq LSTM.
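One common way to cast outfit generation for sequence models such as the ones named above is to treat an outfit as a sequence of item IDs. The sketch below shows that framing for a Seq-to-Seq LSTM; the vocabulary size, dimensions, and personalization signal are placeholders and not the adaptations made in the paper.

```python
# Sketch of outfit generation as sequence-to-sequence prediction over item IDs:
# the encoder reads anchor items, the decoder emits compatible items.
# All sizes and the training setup are placeholders.
import torch
import torch.nn as nn

class OutfitSeq2Seq(nn.Module):
    def __init__(self, n_items, d_emb=64, d_hidden=128):
        super().__init__()
        self.item_emb = nn.Embedding(n_items, d_emb)
        self.encoder = nn.LSTM(d_emb, d_hidden, batch_first=True)
        self.decoder = nn.LSTM(d_emb, d_hidden, batch_first=True)
        self.out = nn.Linear(d_hidden, n_items)

    def forward(self, anchor_items, target_items):
        # anchor_items: items the customer starts from (e.g., a selected shirt);
        # target_items: the rest of the outfit, used with teacher forcing.
        _, state = self.encoder(self.item_emb(anchor_items))
        dec_out, _ = self.decoder(self.item_emb(target_items), state)
        return self.out(dec_out)               # next-item logits at every step

# Hypothetical usage:
# model = OutfitSeq2Seq(n_items=50000)
# logits = model(torch.randint(0, 50000, (4, 2)), torch.randint(0, 50000, (4, 3)))
```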
Compared to its 2D image counterpart, zero-shot learning (ZSL) on 3D point cloud data is a related but under-explored problem. 3D data poses new challenges for ZSL because pre-trained feature extraction models are not available. To address this, we propose a prompt-guided 3D scene generation and supervision method that augments 3D data so the network learns better, exploring the complex interplay of seen and unseen objects. First, we merge the point clouds of two 3D models in ways described by a prompt; the prompt acts as an annotation describing each 3D scene. We then perform contrastive learning to train our proposed architecture in an end-to-end manner. We argue that a 3D scene can relate objects more effectively than a single object, since popular language models (such as BERT) achieve high performance when objects appear in context. Our proposed prompt-guided scene generation method couples data augmentation with prompt-based annotation/captioning to improve 3D ZSL performance. We achieve state-of-the-art ZSL and generalized ZSL performance on synthetic (ModelNet40, ModelNet10) and real-scanned (ScanObjectNN) 3D object datasets.
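The scene-merging step and the contrastive objective are only outlined in the abstract. The sketch below assumes two point clouds are offset and concatenated according to a spatial-relation prompt, and shows an InfoNCE-style loss between (placeholder) point-cloud features and BERT prompt embeddings; the offsets, captions, and point-cloud encoder are assumptions.

```python
# Sketch of prompt-guided 3D scene generation plus a contrastive loss between
# scene features and prompt embeddings. Offsets, captions, and the point-cloud
# encoder are assumptions based on the abstract.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

def make_scene(pc_a, pc_b, offset=(1.0, 0.0, 0.0)):
    """Place object B next to object A; return the merged scene and a prompt."""
    shifted_b = pc_b + torch.tensor(offset)
    scene = torch.cat([pc_a, shifted_b], dim=0)              # (N_a + N_b, 3)
    prompt = "a scene with one object placed beside another" # placeholder caption
    return scene, prompt

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
text_encoder = AutoModel.from_pretrained("bert-base-uncased")

def prompt_embedding(prompts):
    tokens = tokenizer(prompts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        return text_encoder(**tokens).last_hidden_state[:, 0]  # [CLS] embeddings

def contrastive_loss(scene_feats, text_feats, temperature=0.07):
    """InfoNCE: matching scene/prompt pairs attract, mismatched pairs repel."""
    scene_feats = F.normalize(scene_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = scene_feats @ text_feats.T / temperature
    targets = torch.arange(logits.size(0))
    return F.cross_entropy(logits, targets)

# Hypothetical usage: scene_feats would come from a point-cloud encoder projected
# to the same dimensionality as the 768-d BERT prompt embeddings.
```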
Effective localization of mitoses is a critical precursor task for determining tumor prognosis and grade. Automated detection through deep-learning-based image analysis often fails because of inherent domain bias. This paper proposes a domain homogenizer for mitosis detection that attempts to alleviate the domain differences of histology images through adversarial reconstruction of the input images. The proposed homogenizer is based on a U-Net architecture and can effectively reduce the domain differences commonly seen in histology imaging data. We demonstrate the effectiveness of our domain homogenizer by measuring the domain differences between the preprocessed images. Using this homogenizer, together with a subsequent RetinaNet detector, we are able to outperform the baseline of the 2021 MIDOG challenge in terms of the average precision of detected mitotic figures.
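A hedged sketch of the adversarial-reconstruction idea: a small U-Net-style reconstructor whose output should stay close to the input while giving a domain discriminator as little to work with as possible. The network sizes, loss weights, and discriminator are placeholders; the actual homogenizer and the RetinaNet detector are considerably more involved.

```python
# Sketch of an adversarial reconstruction homogenizer: reconstruct the input
# image while suppressing domain cues a discriminator could exploit.
# All sizes and weights are placeholders, not the paper's configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    """Very small encoder-decoder standing in for the U-Net homogenizer."""
    def __init__(self, ch=3):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                                 nn.ConvTranspose2d(32, ch, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        return self.dec(self.enc(x))

class DomainDiscriminator(nn.Module):
    """Predicts which scanner/domain an image came from."""
    def __init__(self, ch=3, n_domains=4):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(32, n_domains))

    def forward(self, x):
        return self.net(x)

def homogenizer_loss(homogenizer, discriminator, images, domain_labels, adv_weight=0.1):
    """Reconstruction loss plus an adversarial term that penalizes predictable domains."""
    recon = homogenizer(images)
    recon_loss = F.l1_loss(recon, images)
    # The homogenizer is rewarded when the (separately trained) discriminator fails.
    adv_loss = -F.cross_entropy(discriminator(recon), domain_labels)
    return recon_loss + adv_weight * adv_loss
```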
In recent years, face recognition systems have achieved extraordinary success thanks to promising advances in deep learning architectures. However, they still fail to achieve the expected accuracy when matching profile images against a gallery of frontal images. Current methods either perform pose normalization (i.e., frontalization) or disentangle pose information for face recognition. Instead, we propose a new approach that uses pose as auxiliary information via an attention mechanism. In this paper, we hypothesize that pose-attended information obtained through an attention mechanism can guide contextual and distinctive feature extraction from profile faces, which further enables better representation learning in the embedding domain. To achieve this, first, we design a unified coupled profile-to-frontal face recognition network. It learns a mapping from faces to a compact embedding subspace via a class-specific contrastive loss. Second, we develop a novel Pose Attention Block (PAB) to specifically guide the extraction of pose-agnostic features from profile faces. More specifically, PAB is designed to explicitly help the network emphasize important features along the channel and spatial dimensions, while learning discriminative yet pose-invariant features in the embedding subspace. To validate the effectiveness of our proposed method, we conduct experiments on both controlled and in-the-wild benchmarks, including Multi-PIE, CFP, and IJB-C, and demonstrate superiority over the state of the art.
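The abstract says the PAB emphasizes features along the channel and spatial dimensions but does not give its internals. The sketch below follows a generic CBAM-style channel-plus-spatial attention block and should not be read as the authors' exact design; in particular, how pose information is injected is not shown here.

```python
# CBAM-style channel + spatial attention block, standing in for the Pose
# Attention Block described in the abstract; the actual PAB design may differ.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: squeeze spatial dims, produce per-channel weights.
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        # Spatial attention: a 7x7 conv over pooled channel statistics.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        channel_weights = self.channel_mlp(x).view(b, c, 1, 1)
        x = x * channel_weights                             # re-weight channels
        avg_map = x.mean(dim=1, keepdim=True)
        max_map, _ = x.max(dim=1, keepdim=True)
        spatial_weights = self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x * spatial_weights                          # re-weight spatial positions

# Hypothetical usage inside a face-recognition backbone:
# attn = ChannelSpatialAttention(channels=256)
# refined = attn(torch.randn(8, 256, 14, 14))
```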